Sentiment Detection with Character n-Grams

نویسندگان

  • Tino Hartmann
  • Sebastian Klenk
  • Andre Burkovski
  • Gunther Heidemann
چکیده

Automatic detection of the sentiment of a given text is a difficult but highly relevant task. Application areas range from financial news, where information about sentiments can be used to predict stock movements, to social media, where user recommendations can determine success or failure of a product. We have developed a methodology, based on character ngrams, to detect sentiments encoded in text. In the course of this paper we will present the founding idea and the algorithms as well as a usage scenario with an evaluation. We discuss the the obtained results in detail and a compare them with those of other popular sentiment detection methodologies.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Comparison of Approaches for Sentiment Classification on Lithuanian Internet Comments

Despite many methods that effectively solve sentiment classification task for such widely used languages as English, there is no clear answer which methods are the most suitable for the languages that are substantially different. In this paper we attempt to solve Internet comments sentiment classification task for Lithuanian, using two classification approaches – knowledge-based and supervised ...

متن کامل

On the Impact of Sentiment and Emotion Based Features in Detecting Online Sexual Predators

According to previous work on pedophile psychology and cyberpedophilia, sentiments and emotions in texts could be a good clue to detect online sexual predation. In this paper, we have suggested a list of high-level features, including sentiment and emotion based ones, for detection of online sexual predation. In particular, since pedophiles are known to be emotionally unstable, we were interest...

متن کامل

PAN 2017: Author Profiling - Gender and Language Variety Prediction

We present the results of gender and language variety identification performed on the tweet corpus prepared for the PAN 2017 Author profiling shared task. Our approach consists of tweet preprocessing, feature construction, feature weighting and classification model construction. We propose a Logistic regression classifier, where the main features are different types of character and word n-gram...

متن کامل

Enhanced Twitter Sentiment Classification Using Contextual Information

The rise in popularity and ubiquity of Twitter has made sentiment analysis of tweets an important and well-covered area of research. However, the 140 character limit imposed on tweets makes it hard to use standard linguistic methods for sentiment classification. On the other hand, what tweets lack in structure they make up with sheer volume and rich metadata. This metadata includes geolocation,...

متن کامل

LASSA: Emotion Detection via Information Fusion

DUE TO THE COMPLEXITY OF EMOTIONS IN SUICIDE NOTES AND THE SUBTLE NATURE OF SENTIMENTS, THIS STUDY PROPOSES A FUSION APPROACH TO TACKLE THE CHALLENGE OF SENTIMENT CLASSIFICATION IN SUICIDE NOTES: leveraging WordNet-based lexicons, manually created rules, character-based n-grams, and other linguistic features. Although our results are not satisfying, some valuable lessons are learned and promisi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011